The USA Today Diversity Index

The USA TODAY Diversity Index is a number – on a scale from 0 to 100 – that represents the chance that two people chosen randomly from an area will be different by race and ethnicity. In more personal terms: “What is the chance that the next person I meet will be different from me?” A higher number means more diversity, a lower number, less. The index was invented in 1991 by Phil Meyer of the University of North Carolina and Shawn McIntosh of USA TODAY.

Exploratory Analysis

  1. At what level (State, County, City) is the American Community Survey data available? How is this different than the deccenial census?

Answer: The ACS has data available on State, County, and Tract level. Tract is way smaller than City level, but it is the closest to it. The same levels exist in the deccenial as well.

  1. What variable and variable codes are available to describe race and ethnicity in the US Census? Describe how these variables are represented in the data (Variables: B02001_001-B02001_006 & B03002_001-B03002_002 & B03002_012).

Answer: B02001_002-B02001_006 estimates the total of the population of a certain race. 002 is White Alone, 003 is Black or African American alone, 004 is American Indian and Alaska Native alone, 005 is Aisan alone, and 006 is Native Hawaiian and Other Pacific Islander alone. B02001_001 is the estimated total of all. B03002_001 is the estimated total in the calculation of hispanic or no hispanic ethnicity. B03002_002 is estimated total of people who are not hispanic or latino, and B03002_12 is the estimated total of hispanic and latino.

  1. How does the American Community Survey define race and ethnicity? Is this important information to report under assumptions for your analysis?

Answer: Race: It is a social definition within the country, the standards are set bu the US Office of Management and Budget. This division usually stems from where the person has origins from. Ethnicity: US Office of Management and Budget says it is a social definition which is seperate to the one on race. In this case, they only assume that there are only two different ethnicities: hispanic/latino and non-hispanic/latino. People who identifies themselves as hispanic/latino can also identifiy themselves as any of the races listed by OMB. This is important since there could potentially be some overlapping of people in different categories. Since ACS is talking about how the difference between race and ethnicity in social perception, it is harder to compare races with ethnicities since they can overlap and have different definitions.

  1. Does the American Community Survey provide the margin of error for their estimates of the proportion of the prevalence of each race and ethnicity? How might this impact the validity of our results?

Answer: Yes, there will be some room for slight difference between the result and the actual world. This means that there should be some caution making important decisions regarding the results. It is important to be transparent when discussing the result.

  1. Use the tidycensus API to assign the race and ethnicity data for New York, New Jersey and Connecticut (at the County level) to a data frame.
library(tidycensus)
library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.3.2     v purrr   0.3.4
## v tibble  3.0.4     v dplyr   1.0.2
## v tidyr   1.1.2     v stringr 1.4.0
## v readr   1.3.1     v forcats 0.5.0
## Warning: package 'tibble' was built under R version 4.0.3
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
census_api_key("d339aa0082b0bc4697f233a571cd3cfee61f5c6d")
## To install your API key for use in future sessions, run this function with `install = TRUE`.
#Create vector with all the variables
var_race <- c(paste0("B02001_00", 1:6), paste0("B03002_00", 1:2), "B03002_012")

#Create vector with all the states
states <- c("NY", "CT", "NJ")

#Use the API to get the data
df <- get_acs(geography = "county", variables = var_race, state = states, year = 2018)
## Getting data from the 2014-2018 5-year ACS
#Display the first 5 rows
head(df)
## # A tibble: 6 x 5
##   GEOID NAME                          variable   estimate   moe
##   <chr> <chr>                         <chr>         <dbl> <dbl>
## 1 09001 Fairfield County, Connecticut B02001_001   944348    NA
## 2 09001 Fairfield County, Connecticut B02001_002   691642  2693
## 3 09001 Fairfield County, Connecticut B02001_003   107215  1230
## 4 09001 Fairfield County, Connecticut B02001_004     2338   505
## 5 09001 Fairfield County, Connecticut B02001_005    49927   908
## 6 09001 Fairfield County, Connecticut B02001_006      487   159

Computing The USA Today Diversity Index

Each of the calculations below will be done by county and not in aggregate.

Step 1:

In the current federal scheme, there are five named races – white, black/African-American, American Indian/Alaska Native, Asian and Native Hawaiian/Other Pacific Islander and an estimate for total population (B02001_001-B02001_006). Ensure that you have collected the proper data from the tidycensus API for these values, as well as the values for the Hispanic population (B03002_001-B03002_002 & B03002_012).

Use the spread function to create columns for each racial group (and the total population). Rename these columns to better reflect the data if you have not already done so.

Calculate each group’s share of the population. This is done by dividing the value for each racial column by the total population column. Create new variables for your data frame for each calculation.

\[ \small RaceProportion_i = \frac{Race_i}{Total_i} \]

#I need to start by saying this looks terrible from a code style perspective. I simply could not get it to work with better formatting for some reason.
#This block of code first transforms the different variables into columns in order to make the calculations easier. 
#Second, it changes the names of the variable columns into easier understandable names.
#Lastly, it does the necessary calculations in order to get the RaceProportion
df <- df %>%  pivot_wider(names_from = variable, values_from = c(estimate, moe)) %>% select(GEOID, NAME, paste0("estimate_", var_race)) %>% rename(TotalEstimateRace = estimate_B02001_001, WhiteEstimate = estimate_B02001_002, BlackEstimate = estimate_B02001_003, NativeEstimate = estimate_B02001_004, AsianEstimate = estimate_B02001_005, PacificEstimate = estimate_B02001_006, TotalEstimateEthnic = estimate_B03002_001, NotHispanicEstimate = estimate_B03002_002, HispanicEstimate = estimate_B03002_012) %>% mutate(WhiteProp = WhiteEstimate/TotalEstimateRace, BlackProp = BlackEstimate/TotalEstimateRace, NativeProp = NativeEstimate/TotalEstimateRace, AsianProp = AsianEstimate/TotalEstimateRace, PacificProp = PacificEstimate/TotalEstimateRace)


df
## # A tibble: 91 x 16
##    GEOID NAME  TotalEstimateRa~ WhiteEstimate BlackEstimate NativeEstimate
##    <chr> <chr>            <dbl>         <dbl>         <dbl>          <dbl>
##  1 09001 Fair~           944348        691642        107215           2338
##  2 09003 Hart~           894730        639528        122103           2945
##  3 09005 Litc~           183031        170083          3528            356
##  4 09007 Midd~           163368        144844          8654            264
##  5 09009 New ~           859339        634586        113737           1540
##  6 09011 New ~           268881        216974         15760           1477
##  7 09013 Toll~           151269        133656          4621             67
##  8 09015 Wind~           116538        103567          2644            711
##  9 34001 Atla~           268539        178914         40319            969
## 10 34003 Berg~           929999        663672         55497           1695
## # ... with 81 more rows, and 10 more variables: AsianEstimate <dbl>,
## #   PacificEstimate <dbl>, TotalEstimateEthnic <dbl>,
## #   NotHispanicEstimate <dbl>, HispanicEstimate <dbl>, WhiteProp <dbl>,
## #   BlackProp <dbl>, NativeProp <dbl>, AsianProp <dbl>, PacificProp <dbl>

Step 2:

Take each racial group’s share of the population, square it and sum the results.

\[ \small P(Racial_i) = \sum_{i=1}^{n} RaceProportion_i^2 \]

The Census also includes a category called “Some other race.” Because studies show that people who check it are overwhelmingly Hispanic, that category is not used. Hispanics’ effect on diversity is calculated in Step 3.

#Creating a new column called p_racial with the proper calculations. sum() did not work for me here, so I resorted back to simply adding them together manually. I know this is not best practice. 
df <- df %>% mutate(p_racial = WhiteProp^2 + BlackProp^2 + NativeProp^2 + AsianProp^2 + PacificProp^2)

df
## # A tibble: 91 x 17
##    GEOID NAME  TotalEstimateRa~ WhiteEstimate BlackEstimate NativeEstimate
##    <chr> <chr>            <dbl>         <dbl>         <dbl>          <dbl>
##  1 09001 Fair~           944348        691642        107215           2338
##  2 09003 Hart~           894730        639528        122103           2945
##  3 09005 Litc~           183031        170083          3528            356
##  4 09007 Midd~           163368        144844          8654            264
##  5 09009 New ~           859339        634586        113737           1540
##  6 09011 New ~           268881        216974         15760           1477
##  7 09013 Toll~           151269        133656          4621             67
##  8 09015 Wind~           116538        103567          2644            711
##  9 34001 Atla~           268539        178914         40319            969
## 10 34003 Berg~           929999        663672         55497           1695
## # ... with 81 more rows, and 11 more variables: AsianEstimate <dbl>,
## #   PacificEstimate <dbl>, TotalEstimateEthnic <dbl>,
## #   NotHispanicEstimate <dbl>, HispanicEstimate <dbl>, WhiteProp <dbl>,
## #   BlackProp <dbl>, NativeProp <dbl>, AsianProp <dbl>, PacificProp <dbl>,
## #   p_racial <dbl>

Step 3:

Because Hispanic origin is a separate Census question, the probability that someone is Hispanic or not must be figured separately. Take the Hispanic and non-Hispanic percentages of the population, square each and add them to get the chance that any two people will be Hispanic or not. Use this calculation to create a new variable in your data frame.

\[ \small P(Ethnic_i) = Hispanic_i^2+ Non Hispanic_i^2 \]

#Creating a new column called p_ethnic which is the same calculation as for p_racial just with ethnics instead
df <- df %>% mutate(p_ethnic = (HispanicEstimate/TotalEstimateEthnic)^2 + (NotHispanicEstimate/TotalEstimateEthnic)^2)

df
## # A tibble: 91 x 18
##    GEOID NAME  TotalEstimateRa~ WhiteEstimate BlackEstimate NativeEstimate
##    <chr> <chr>            <dbl>         <dbl>         <dbl>          <dbl>
##  1 09001 Fair~           944348        691642        107215           2338
##  2 09003 Hart~           894730        639528        122103           2945
##  3 09005 Litc~           183031        170083          3528            356
##  4 09007 Midd~           163368        144844          8654            264
##  5 09009 New ~           859339        634586        113737           1540
##  6 09011 New ~           268881        216974         15760           1477
##  7 09013 Toll~           151269        133656          4621             67
##  8 09015 Wind~           116538        103567          2644            711
##  9 34001 Atla~           268539        178914         40319            969
## 10 34003 Berg~           929999        663672         55497           1695
## # ... with 81 more rows, and 12 more variables: AsianEstimate <dbl>,
## #   PacificEstimate <dbl>, TotalEstimateEthnic <dbl>,
## #   NotHispanicEstimate <dbl>, HispanicEstimate <dbl>, WhiteProp <dbl>,
## #   BlackProp <dbl>, NativeProp <dbl>, AsianProp <dbl>, PacificProp <dbl>,
## #   p_racial <dbl>, p_ethnic <dbl>

Step 4:

To calculate whether two people are the same on both measures, multiply the results of the first two steps. Use this calculation to create a new column in your data frame. This is the probability that any two people are the SAME by race and ethnicity.

\[ \small P(Same_i) = P(Racial_i) \times P(Ethnic_i) \]

#Creating a new column called p_same which will be used for later calculations.
df <- df %>% mutate(p_same = p_racial * p_ethnic)

df
## # A tibble: 91 x 19
##    GEOID NAME  TotalEstimateRa~ WhiteEstimate BlackEstimate NativeEstimate
##    <chr> <chr>            <dbl>         <dbl>         <dbl>          <dbl>
##  1 09001 Fair~           944348        691642        107215           2338
##  2 09003 Hart~           894730        639528        122103           2945
##  3 09005 Litc~           183031        170083          3528            356
##  4 09007 Midd~           163368        144844          8654            264
##  5 09009 New ~           859339        634586        113737           1540
##  6 09011 New ~           268881        216974         15760           1477
##  7 09013 Toll~           151269        133656          4621             67
##  8 09015 Wind~           116538        103567          2644            711
##  9 34001 Atla~           268539        178914         40319            969
## 10 34003 Berg~           929999        663672         55497           1695
## # ... with 81 more rows, and 13 more variables: AsianEstimate <dbl>,
## #   PacificEstimate <dbl>, TotalEstimateEthnic <dbl>,
## #   NotHispanicEstimate <dbl>, HispanicEstimate <dbl>, WhiteProp <dbl>,
## #   BlackProp <dbl>, NativeProp <dbl>, AsianProp <dbl>, PacificProp <dbl>,
## #   p_racial <dbl>, p_ethnic <dbl>, p_same <dbl>

Step 5:

Subtract the result from 1 to get the chance that two people are different – diverse. For ease of use, multiply the result by 100 to place it on a scale from 0 to 100. Create a new column with your USA Today Diversity Index value.

\[ \small DiversityIndex_i = \Big( 1 - P(Same_i) \Big) \times 100 \]

#Create the column called DiversityIndex
df <- df %>% mutate(DiversityIndex = (1 - p_same)*100)

df
## # A tibble: 91 x 20
##    GEOID NAME  TotalEstimateRa~ WhiteEstimate BlackEstimate NativeEstimate
##    <chr> <chr>            <dbl>         <dbl>         <dbl>          <dbl>
##  1 09001 Fair~           944348        691642        107215           2338
##  2 09003 Hart~           894730        639528        122103           2945
##  3 09005 Litc~           183031        170083          3528            356
##  4 09007 Midd~           163368        144844          8654            264
##  5 09009 New ~           859339        634586        113737           1540
##  6 09011 New ~           268881        216974         15760           1477
##  7 09013 Toll~           151269        133656          4621             67
##  8 09015 Wind~           116538        103567          2644            711
##  9 34001 Atla~           268539        178914         40319            969
## 10 34003 Berg~           929999        663672         55497           1695
## # ... with 81 more rows, and 14 more variables: AsianEstimate <dbl>,
## #   PacificEstimate <dbl>, TotalEstimateEthnic <dbl>,
## #   NotHispanicEstimate <dbl>, HispanicEstimate <dbl>, WhiteProp <dbl>,
## #   BlackProp <dbl>, NativeProp <dbl>, AsianProp <dbl>, PacificProp <dbl>,
## #   p_racial <dbl>, p_ethnic <dbl>, p_same <dbl>, DiversityIndex <dbl>

Geo-spatial Analysis and Visualization

Be sure to properly label your plots and axes. Points will be deducted for incorrect plot titles or axes.

  1. Create a histogram of USA Today Diversity Index values. Describe the shape of the histogram in statistical terms (Hint: skewness).

Answer: #The data has a positive skewness. This means that there are more frequent counties in the dataset with lower diversity index score than counties with high.

#Creating a histogram with the frequency of the different Diversity Indexes from the counties
hist(df$DiversityIndex, main = "Histogram for USA Diversity Index", xlab = "Diversity Index", ylab = "Frequency")

  1. Create a visualization which compares the top 10 counties and their diversity index value using ggplot2.
library(ggplot2)

#Sorting the data to get the 10 counties with the highest Diversity Index
index_order <- order(df$DiversityIndex, decreasing = TRUE)
index_order <- index_order[1:10]

#Create a bar chart with the 10 counties with the highest Diversity Index
df_index_order <- df[index_order,c("NAME", "DiversityIndex")]
ggplot(df_index_order, aes(x=NAME, y=DiversityIndex)) + geom_col() + scale_x_discrete(guide = guide_axis(n.dodge=4)) + labs(title="The 10 Counties With the Gighest Diversity Index")

  1. Using the leaflet mapping library for R (or another mapping library of your choice), visualize the USA Today Diversity Index by county for New York, New Jersey and Connecticut.
library(mapview)
library(sp)
## Warning: package 'sp' was built under R version 4.0.3
#Since some of the calculations could not be done when the geometry columns existed because of variable length limitations, a new df is created with the geometry.
df1 <- get_acs(geography = "county", variables = var_race, state = states, year = 2018, geometry = TRUE)
## Getting data from the 2014-2018 5-year ACS
## Downloading feature geometry from the Census website.  To cache shapefiles for use in future sessions, set `options(tigris_use_cache = TRUE)`.
## 
  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |                                                                      |   1%
  |                                                                            
  |=                                                                     |   1%
  |                                                                            
  |=                                                                     |   2%
  |                                                                            
  |==                                                                    |   2%
  |                                                                            
  |==                                                                    |   3%
  |                                                                            
  |==                                                                    |   4%
  |                                                                            
  |===                                                                   |   4%
  |                                                                            
  |===                                                                   |   5%
  |                                                                            
  |====                                                                  |   5%
  |                                                                            
  |====                                                                  |   6%
  |                                                                            
  |=====                                                                 |   7%
  |                                                                            
  |=====                                                                 |   8%
  |                                                                            
  |======                                                                |   8%
  |                                                                            
  |======                                                                |   9%
  |                                                                            
  |=======                                                               |   9%
  |                                                                            
  |=======                                                               |  10%
  |                                                                            
  |=======                                                               |  11%
  |                                                                            
  |========                                                              |  11%
  |                                                                            
  |========                                                              |  12%
  |                                                                            
  |=========                                                             |  12%
  |                                                                            
  |=========                                                             |  13%
  |                                                                            
  |=========                                                             |  14%
  |                                                                            
  |==========                                                            |  14%
  |                                                                            
  |==========                                                            |  15%
  |                                                                            
  |===========                                                           |  15%
  |                                                                            
  |===========                                                           |  16%
  |                                                                            
  |============                                                          |  16%
  |                                                                            
  |============                                                          |  17%
  |                                                                            
  |============                                                          |  18%
  |                                                                            
  |=============                                                         |  18%
  |                                                                            
  |=============                                                         |  19%
  |                                                                            
  |==============                                                        |  19%
  |                                                                            
  |==============                                                        |  20%
  |                                                                            
  |==============                                                        |  21%
  |                                                                            
  |===============                                                       |  21%
  |                                                                            
  |===============                                                       |  22%
  |                                                                            
  |================                                                      |  22%
  |                                                                            
  |================                                                      |  23%
  |                                                                            
  |================                                                      |  24%
  |                                                                            
  |=================                                                     |  24%
  |                                                                            
  |=================                                                     |  25%
  |                                                                            
  |==================                                                    |  25%
  |                                                                            
  |==================                                                    |  26%
  |                                                                            
  |===================                                                   |  27%
  |                                                                            
  |===================                                                   |  28%
  |                                                                            
  |====================                                                  |  28%
  |                                                                            
  |====================                                                  |  29%
  |                                                                            
  |=====================                                                 |  29%
  |                                                                            
  |=====================                                                 |  30%
  |                                                                            
  |=====================                                                 |  31%
  |                                                                            
  |======================                                                |  31%
  |                                                                            
  |======================                                                |  32%
  |                                                                            
  |=======================                                               |  32%
  |                                                                            
  |=======================                                               |  33%
  |                                                                            
  |========================                                              |  34%
  |                                                                            
  |========================                                              |  35%
  |                                                                            
  |=========================                                             |  35%
  |                                                                            
  |=========================                                             |  36%
  |                                                                            
  |==========================                                            |  37%
  |                                                                            
  |==========================                                            |  38%
  |                                                                            
  |===========================                                           |  38%
  |                                                                            
  |===========================                                           |  39%
  |                                                                            
  |============================                                          |  39%
  |                                                                            
  |============================                                          |  40%
  |                                                                            
  |============================                                          |  41%
  |                                                                            
  |=============================                                         |  41%
  |                                                                            
  |=============================                                         |  42%
  |                                                                            
  |==============================                                        |  42%
  |                                                                            
  |==============================                                        |  43%
  |                                                                            
  |===============================                                       |  44%
  |                                                                            
  |===============================                                       |  45%
  |                                                                            
  |================================                                      |  45%
  |                                                                            
  |================================                                      |  46%
  |                                                                            
  |=================================                                     |  46%
  |                                                                            
  |=================================                                     |  47%
  |                                                                            
  |=================================                                     |  48%
  |                                                                            
  |==================================                                    |  48%
  |                                                                            
  |==================================                                    |  49%
  |                                                                            
  |===================================                                   |  49%
  |                                                                            
  |===================================                                   |  50%
  |                                                                            
  |===================================                                   |  51%
  |                                                                            
  |====================================                                  |  51%
  |                                                                            
  |====================================                                  |  52%
  |                                                                            
  |=====================================                                 |  52%
  |                                                                            
  |=====================================                                 |  53%
  |                                                                            
  |======================================                                |  54%
  |                                                                            
  |======================================                                |  55%
  |                                                                            
  |=======================================                               |  55%
  |                                                                            
  |=======================================                               |  56%
  |                                                                            
  |========================================                              |  56%
  |                                                                            
  |========================================                              |  57%
  |                                                                            
  |========================================                              |  58%
  |                                                                            
  |=========================================                             |  58%
  |                                                                            
  |=========================================                             |  59%
  |                                                                            
  |==========================================                            |  59%
  |                                                                            
  |==========================================                            |  60%
  |                                                                            
  |==========================================                            |  61%
  |                                                                            
  |===========================================                           |  61%
  |                                                                            
  |===========================================                           |  62%
  |                                                                            
  |============================================                          |  62%
  |                                                                            
  |============================================                          |  63%
  |                                                                            
  |=============================================                         |  64%
  |                                                                            
  |=============================================                         |  65%
  |                                                                            
  |==============================================                        |  65%
  |                                                                            
  |==============================================                        |  66%
  |                                                                            
  |===============================================                       |  66%
  |                                                                            
  |===============================================                       |  67%
  |                                                                            
  |===============================================                       |  68%
  |                                                                            
  |================================================                      |  68%
  |                                                                            
  |================================================                      |  69%
  |                                                                            
  |=================================================                     |  69%
  |                                                                            
  |=================================================                     |  70%
  |                                                                            
  |=================================================                     |  71%
  |                                                                            
  |==================================================                    |  71%
  |                                                                            
  |==================================================                    |  72%
  |                                                                            
  |===================================================                   |  72%
  |                                                                            
  |===================================================                   |  73%
  |                                                                            
  |====================================================                  |  74%
  |                                                                            
  |====================================================                  |  75%
  |                                                                            
  |=====================================================                 |  75%
  |                                                                            
  |=====================================================                 |  76%
  |                                                                            
  |======================================================                |  76%
  |                                                                            
  |======================================================                |  77%
  |                                                                            
  |======================================================                |  78%
  |                                                                            
  |=======================================================               |  78%
  |                                                                            
  |=======================================================               |  79%
  |                                                                            
  |========================================================              |  79%
  |                                                                            
  |========================================================              |  80%
  |                                                                            
  |========================================================              |  81%
  |                                                                            
  |=========================================================             |  81%
  |                                                                            
  |=========================================================             |  82%
  |                                                                            
  |==========================================================            |  82%
  |                                                                            
  |==========================================================            |  83%
  |                                                                            
  |===========================================================           |  84%
  |                                                                            
  |===========================================================           |  85%
  |                                                                            
  |============================================================          |  85%
  |                                                                            
  |============================================================          |  86%
  |                                                                            
  |=============================================================         |  86%
  |                                                                            
  |=============================================================         |  87%
  |                                                                            
  |=============================================================         |  88%
  |                                                                            
  |==============================================================        |  88%
  |                                                                            
  |==============================================================        |  89%
  |                                                                            
  |===============================================================       |  89%
  |                                                                            
  |===============================================================       |  90%
  |                                                                            
  |===============================================================       |  91%
  |                                                                            
  |================================================================      |  91%
  |                                                                            
  |================================================================      |  92%
  |                                                                            
  |=================================================================     |  92%
  |                                                                            
  |=================================================================     |  93%
  |                                                                            
  |=================================================================     |  94%
  |                                                                            
  |==================================================================    |  94%
  |                                                                            
  |==================================================================    |  95%
  |                                                                            
  |===================================================================   |  95%
  |                                                                            
  |===================================================================   |  96%
  |                                                                            
  |====================================================================  |  97%
  |                                                                            
  |====================================================================  |  98%
  |                                                                            
  |===================================================================== |  98%
  |                                                                            
  |===================================================================== |  99%
  |                                                                            
  |======================================================================|  99%
  |                                                                            
  |======================================================================| 100%
#Only keeping the relevant columns
df1 <- df1[, c(1, 2, 6)]

#Sorting df1 so it aligns with df in order to get the proper calculations to the proper rows
df1 <- df1 %>% arrange(GEOID)

#Get the DiversityIndex column from df and give those values to the corresponding row in df1
for (i in 1:nrow(df)) {
  for (j in 1:9) {
    df1[(9 * (i - 1) + j), "DiversityIndex"] = df[i, "DiversityIndex"]
    #print((9 * (i - 1) + j)) 
  
    }
}
#Create the heatmap of the sampled counties
mapview(df1, zcol = "DiversityIndex", legend = TRUE)
df1
## Simple feature collection with 819 features and 3 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: -79.76215 ymin: 38.92852 xmax: -71.78699 ymax: 45.01585
## geographic CRS: NAD83
## First 10 features:
##    GEOID                          NAME                       geometry
## 1  09001 Fairfield County, Connecticut MULTIPOLYGON (((-73.21717 4...
## 2  09001 Fairfield County, Connecticut MULTIPOLYGON (((-73.21717 4...
## 3  09001 Fairfield County, Connecticut MULTIPOLYGON (((-73.21717 4...
## 4  09001 Fairfield County, Connecticut MULTIPOLYGON (((-73.21717 4...
## 5  09001 Fairfield County, Connecticut MULTIPOLYGON (((-73.21717 4...
## 6  09001 Fairfield County, Connecticut MULTIPOLYGON (((-73.21717 4...
## 7  09001 Fairfield County, Connecticut MULTIPOLYGON (((-73.21717 4...
## 8  09001 Fairfield County, Connecticut MULTIPOLYGON (((-73.21717 4...
## 9  09001 Fairfield County, Connecticut MULTIPOLYGON (((-73.21717 4...
## 10 09003  Hartford County, Connecticut MULTIPOLYGON (((-73.02054 4...
##    DiversityIndex
## 1        62.01604
## 2        62.01604
## 3        62.01604
## 4        62.01604
## 5        62.01604
## 6        62.01604
## 7        62.01604
## 8        62.01604
## 9        62.01604
## 10       62.23732
#I understand that there must be a more efficient way of doing this, but this was the way it worked for me.
  1. Display the following data in the “tooltip” when mousing over your plot: USA Today Diversity Index Value and County Name.
#Since I could not find a way to use (label = "COLUMN") to take more than one argument, I decided to create a new column which combines the two desired columns in order for (label = "COLUMN") to display the desired tooltip
df1$LabelDiversity <- paste(df1$NAME, df1$DiversityIndex)

mapview(df1, zcol = "DiversityIndex", legend = TRUE, label = "LabelDiversity")
  1. Does there appear to be any relationship between geography and diversity? Which state appears to be the most diverse?

Answer: It is clear that the counties in and around NYC are the most diverse ones. It is also seen that rural areas seem to have less diversity. This would make sense from a logical point of view where cities tend to be more diverse than rural communities.

Extra Credit

  1. Create a new data frame using the tidycensus API with data on median household income by county for New York, New Jersey and Connecticut. Join this data together with the data from New York County. Use ggplot2 (or another visualization library) to visualize the USA Today Diversity Index value and median household incomeon the same plot (Hint: try facet wrap!).

  2. Does there appear to be any relationship between median household income and diversity? How do counties differ on these two measures?

Answer: